Personalized Spam Filtering for Gray Mail
Abstract
Gray mail, messages that could reasonably be considered either spam or good by different email users, is a commonly observed issue in production spam filtering systems. In this paper we study this class of mail using a large real-world email corpus and signature-based campaign detection techniques. Our analysis shows that even an optimal filter will inevitably perform unsatisfactorily on gray mail unless user preferences are taken into account. To overcome this difficulty we design a lightweight user model that is highly scalable and can be easily combined with a traditional global spam filter. Our approach is able to incorporate both partial and complete user feedback on message labels and catches up to 40% more spam from gray mail in the low false-positive region.
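The abstract does not specify how the user model is built or how it is combined with the global filter. As a rough illustration only, the sketch below assumes one plausible arrangement: the global filter emits a spam probability, and a per-user table of additive log-odds offsets, keyed by sender or campaign, is nudged by that user's feedback. The class `UserModel`, its methods, the key format, and the learning rate are all hypothetical and are not taken from the paper.

```python
import math


def logit(p: float) -> float:
    """Convert a probability to log-odds, clamped to avoid log(0)."""
    p = min(max(p, 1e-6), 1.0 - 1e-6)
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


class UserModel:
    """Hypothetical per-user model: one additive offset per sender/campaign key.

    This is an assumed design for illustration, not the paper's model.
    """

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        self.offsets: dict[str, float] = {}

    def score(self, global_prob: float, key: str) -> float:
        """Personalized spam probability: global log-odds plus the user's offset."""
        return sigmoid(logit(global_prob) + self.offsets.get(key, 0.0))

    def update(self, global_prob: float, key: str, is_spam: bool) -> None:
        """One gradient step on logistic loss, adjusting only the personal offset."""
        target = 1.0 if is_spam else 0.0
        error = target - self.score(global_prob, key)
        self.offsets[key] = self.offsets.get(key, 0.0) + self.learning_rate * error


# Usage: a newsletter the global filter is unsure about (probability 0.5).
model = UserModel()
before = model.score(0.5, "newsletter@example.com")
model.update(0.5, "newsletter@example.com", is_spam=True)  # user reports it as spam
after = model.score(0.5, "newsletter@example.com")
print(before, after)
```

Because each user stores only a small table of offsets and the global filter is left untouched, a design along these lines would stay cheap per user, which is the kind of property the abstract describes as lightweight and scalable.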
Similar resources
Personalized E-mail Filtering System Based on Usage Control
In order to cope with the problem of spam soaring, a personalized e-mail filtering method based on UCON is proposed. E-mails from different senders were classified as junk e-mail, suspicious e-mail and normal email by trust third-party according to the maintained blacklist and embedded machine learning technology online. Suspicious e-mails will be classified further from users’ point of view ma...
Combining Global and Personal Anti-Spam Filtering
Many of the first successful applications of statistical learning to anti-spam filtering were personalized classifiers that were trained on an individual user’s spam and ham e-mail. Proponents of personalized filters argue that statistical text learning is effective because it can identify the unique aspects of each individual’s e-mail. On the other hand, a single classifier learned for a large...
SpamCooling: A Parallel Heterogeneous Ensemble Spam Filtering System Based on Active Learning Techniques
Anti-spam technology is developing rapidly in recent years. With the emerging applications of machine learning in diverse fields, researchers as well as manufacturers around the world have attempted a large number of related algorithms to prevent spam. In this paper, we designed an effective anti-spam protection system, SpamCooling, based on the mechanism of active learning and parallel heterog...
An E-mail Authentication and Disposable Addressing Scheme for Filtering Spam
The number of spam mails has spread rapidly in recent years. Currently, the most common spam filtering solutions include blacklisting and content filtering, as well as the Bayesian approach, which uses a Bayesian filter to analyze mail content to generate classifiers. However, spammers can forge their addresses or include additional information that will mislead the filtering system or mark leg...
Towards Symbiotic Spam E-mail Filtering
This position paper discusses the use of symbiotic filtering, a novel distributed data mining approach that combines content-based and collaborative filtering for spam detection.
Publication date: 2008